The goal of this project is to visualize ostrich eggshell (OES) data.

For this project, I compiled a dataset of previously reported ancient ostrich eggshells from the literature. The dataset records as many reported OES details as possible, including the coordinates where each eggshell was found, the country, region, age, species, and citation.

Let’s take a look at the data!

Here I load the packages I need for my visualizations:

library(maps)
library(ggplot2)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(tidyr)
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ forcats   1.0.0     ✔ readr     2.1.4
## ✔ lubridate 1.9.3     ✔ stringr   1.5.1
## ✔ purrr     1.0.2     ✔ tibble    3.2.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ✖ purrr::map()    masks maps::map()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(sf)
## Linking to GEOS 3.10.2, GDAL 3.4.2, PROJ 8.2.1; sf_use_s2() is TRUE
library(rmapshaper)
library(mapview)
library(rnaturalearth)
library(mapproj)
library(ggrepel)
library(ggmap)
## ℹ Google's Terms of Service: <https://mapsplatform.google.com>
## ℹ Please cite ggmap if you use it! Use `citation("ggmap")` for details.
library(ggthemes)

Here I load my dataset, OESdata, which is saved in the DATA folder of this repo:

OESdata<- read_csv("https://raw.githubusercontent.com/jzkvlds/creative-data-visualization/main/DATA/OESdata.csv", col_names = TRUE)
## New names:
## Rows: 90 Columns: 22
## ── Column specification
## ──────────────────────────────────────────────────────── Delimiter: "," chr
## (19): Country, Region, Site, Cal age BP " 1s/ka, 14C age BP, Calculated ... lgl
## (3): Calendar Date (95.4%) (BC), Specimen Number * check (institution),...
## ℹ Use `spec()` to retrieve the full column specification for this data. ℹ
## Specify the column types or set `show_col_types = FALSE` to quiet this message.
## • `` -> `...21`
head(OESdata)
## # A tibble: 6 × 22
##   Country  Region Site  Calendar Date (95.4%…¹ `Cal age BP " 1s/ka` `14C age BP`
##   <chr>    <chr>  <chr> <lgl>                  <chr>                <chr>       
## 1 China: … <NA>   <NA>  NA                     <NA>                 <NA>        
## 2 China: … <NA>   <NA>  NA                     <NA>                 <NA>        
## 3 China    Yangw… <NA>  NA                     <NA>                 <NA>        
## 4 China    Yangw… <NA>  NA                     <NA>                 <NA>        
## 5 China, … <NA>   <NA>  NA                     <NA>                 <NA>        
## 6 China, … <NA>   <NA>  NA                     <NA>                 <NA>        
## # ℹ abbreviated name: ¹​`Calendar Date (95.4%) (BC)`
## # ℹ 16 more variables: `Calculated Age (1950 - # for BP) (BC)` <chr>,
## #   Epoch <chr>, `Other available age data from publications cited` <chr>,
## #   `What type of sample` <chr>, `Species Referal` <chr>,
## #   `Species Referal Revised` <chr>,
## #   `Specimen Number * check (institution)` <lgl>,
## #   `Internal Lab ID ** check` <chr>, …

Here I transform my data into a cleaner version that we can work with by removing a few columns we do not need for this visualization.

(I deleted one column at a time, checking the column names after each step to assess which column to delete next.)
colnames(OESdata)
##  [1] "Country"                                         
##  [2] "Region"                                          
##  [3] "Site"                                            
##  [4] "Calendar Date (95.4%) (BC)"                      
##  [5] "Cal age BP \" 1s/ka"                             
##  [6] "14C age BP"                                      
##  [7] "Calculated Age (1950 - # for BP) (BC)"           
##  [8] "Epoch"                                           
##  [9] "Other available age data from publications cited"
## [10] "What type of sample"                             
## [11] "Species Referal"                                 
## [12] "Species Referal Revised"                         
## [13] "Specimen Number * check (institution)"           
## [14] "Internal Lab ID ** check"                        
## [15] "details of location/environment"                 
## [16] "Cite"                                            
## [17] "This paper cited..."                             
## [18] "Comments"                                        
## [19] "LAT"                                             
## [20] "LONG"                                            
## [21] "...21"                                           
## [22] "Dating Method"
CleanData <- OESdata[, -17]

colnames(CleanData)
##  [1] "Country"                                         
##  [2] "Region"                                          
##  [3] "Site"                                            
##  [4] "Calendar Date (95.4%) (BC)"                      
##  [5] "Cal age BP \" 1s/ka"                             
##  [6] "14C age BP"                                      
##  [7] "Calculated Age (1950 - # for BP) (BC)"           
##  [8] "Epoch"                                           
##  [9] "Other available age data from publications cited"
## [10] "What type of sample"                             
## [11] "Species Referal"                                 
## [12] "Species Referal Revised"                         
## [13] "Specimen Number * check (institution)"           
## [14] "Internal Lab ID ** check"                        
## [15] "details of location/environment"                 
## [16] "Cite"                                            
## [17] "Comments"                                        
## [18] "LAT"                                             
## [19] "LONG"                                            
## [20] "...21"                                           
## [21] "Dating Method"
# That worked, so now we remove another column
CleanData <- CleanData[,-17]

colnames(CleanData)
##  [1] "Country"                                         
##  [2] "Region"                                          
##  [3] "Site"                                            
##  [4] "Calendar Date (95.4%) (BC)"                      
##  [5] "Cal age BP \" 1s/ka"                             
##  [6] "14C age BP"                                      
##  [7] "Calculated Age (1950 - # for BP) (BC)"           
##  [8] "Epoch"                                           
##  [9] "Other available age data from publications cited"
## [10] "What type of sample"                             
## [11] "Species Referal"                                 
## [12] "Species Referal Revised"                         
## [13] "Specimen Number * check (institution)"           
## [14] "Internal Lab ID ** check"                        
## [15] "details of location/environment"                 
## [16] "Cite"                                            
## [17] "LAT"                                             
## [18] "LONG"                                            
## [19] "...21"                                           
## [20] "Dating Method"
CleanData <- CleanData[,-19]
colnames(CleanData)
##  [1] "Country"                                         
##  [2] "Region"                                          
##  [3] "Site"                                            
##  [4] "Calendar Date (95.4%) (BC)"                      
##  [5] "Cal age BP \" 1s/ka"                             
##  [6] "14C age BP"                                      
##  [7] "Calculated Age (1950 - # for BP) (BC)"           
##  [8] "Epoch"                                           
##  [9] "Other available age data from publications cited"
## [10] "What type of sample"                             
## [11] "Species Referal"                                 
## [12] "Species Referal Revised"                         
## [13] "Specimen Number * check (institution)"           
## [14] "Internal Lab ID ** check"                        
## [15] "details of location/environment"                 
## [16] "Cite"                                            
## [17] "LAT"                                             
## [18] "LONG"                                            
## [19] "Dating Method"
CleanData <- CleanData[,-16]
colnames(CleanData)
##  [1] "Country"                                         
##  [2] "Region"                                          
##  [3] "Site"                                            
##  [4] "Calendar Date (95.4%) (BC)"                      
##  [5] "Cal age BP \" 1s/ka"                             
##  [6] "14C age BP"                                      
##  [7] "Calculated Age (1950 - # for BP) (BC)"           
##  [8] "Epoch"                                           
##  [9] "Other available age data from publications cited"
## [10] "What type of sample"                             
## [11] "Species Referal"                                 
## [12] "Species Referal Revised"                         
## [13] "Specimen Number * check (institution)"           
## [14] "Internal Lab ID ** check"                        
## [15] "details of location/environment"                 
## [16] "LAT"                                             
## [17] "LONG"                                            
## [18] "Dating Method"
CleanData <- CleanData[,-15]
colnames(CleanData)
##  [1] "Country"                                         
##  [2] "Region"                                          
##  [3] "Site"                                            
##  [4] "Calendar Date (95.4%) (BC)"                      
##  [5] "Cal age BP \" 1s/ka"                             
##  [6] "14C age BP"                                      
##  [7] "Calculated Age (1950 - # for BP) (BC)"           
##  [8] "Epoch"                                           
##  [9] "Other available age data from publications cited"
## [10] "What type of sample"                             
## [11] "Species Referal"                                 
## [12] "Species Referal Revised"                         
## [13] "Specimen Number * check (institution)"           
## [14] "Internal Lab ID ** check"                        
## [15] "LAT"                                             
## [16] "LONG"                                            
## [17] "Dating Method"
CleanData <- CleanData[,-17]
colnames(CleanData)
##  [1] "Country"                                         
##  [2] "Region"                                          
##  [3] "Site"                                            
##  [4] "Calendar Date (95.4%) (BC)"                      
##  [5] "Cal age BP \" 1s/ka"                             
##  [6] "14C age BP"                                      
##  [7] "Calculated Age (1950 - # for BP) (BC)"           
##  [8] "Epoch"                                           
##  [9] "Other available age data from publications cited"
## [10] "What type of sample"                             
## [11] "Species Referal"                                 
## [12] "Species Referal Revised"                         
## [13] "Specimen Number * check (institution)"           
## [14] "Internal Lab ID ** check"                        
## [15] "LAT"                                             
## [16] "LONG"
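
Dropping columns by position works, but every deletion shifts the positions of the columns after it, so the names have to be rechecked each time. As a possible alternative, the same cleanup could be written as one name-based selection. This is only a sketch on a small made-up data frame (`toy`, its values, and `drop_cols` are illustrative, not the real OESdata), using dplyr's `select()` with `any_of()`:

```r
library(dplyr)

# Toy stand-in for OESdata; the values are invented for illustration
toy <- data.frame(
  Country  = c("China", "Mongolia"),
  Cite     = c("ref A", "ref B"),
  Comments = c("note", NA),
  LAT      = c("40.1", "46.9"),
  LONG     = c("116.4", "103.8"),
  check.names = FALSE
)

# Columns to drop, listed by name rather than by position
drop_cols <- c("Cite", "Comments", "Dating Method")

# any_of() ignores names that are not present, so the call does not fail
# even though "Dating Method" is missing from the toy data
toy_clean <- select(toy, -any_of(drop_cols))

colnames(toy_clean)
```

With names instead of indices, the order in which columns are dropped no longer matters.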

Here I remove unwanted quotation marks that were showing up in the LAT and LONG columns (the function is applied to every character column):

removeQuotes <- function(x) gsub("\"", "", x)

CleanData <- CleanData %>%
    mutate_if(is.character, removeQuotes)
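
To see what this step does, here is a small sketch on invented values (the `toy` data frame is hypothetical). It uses `across(where(is.character), ...)`, which dplyr 1.0 and later offer as the newer spelling of `mutate_if()`; both should give the same result here:

```r
library(dplyr)

removeQuotes <- function(x) gsub("\"", "", x)

# Invented coordinates, some wrapped in stray double quotes
toy <- data.frame(
  LAT  = c("\"45.1\"", "30.5"),
  LONG = c("101.2", "\"99.8\""),
  n    = c(1, 2)
)

# Apply removeQuotes to every character column; numeric columns are untouched
cleaned <- toy %>%
  mutate(across(where(is.character), removeQuotes))

cleaned$LAT
cleaned$LONG
```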

Here I change the name of the first column to “Location”:

colnames(CleanData)[1] <- "Location"

head(CleanData)
## # A tibble: 6 × 16
##   Location Region Site  Calendar Date (95.4%…¹ `Cal age BP " 1s/ka` `14C age BP`
##   <chr>    <chr>  <chr> <lgl>                  <chr>                <chr>       
## 1 China: … <NA>   <NA>  NA                     <NA>                 <NA>        
## 2 China: … <NA>   <NA>  NA                     <NA>                 <NA>        
## 3 China    Yangw… <NA>  NA                     <NA>                 <NA>        
## 4 China    Yangw… <NA>  NA                     <NA>                 <NA>        
## 5 China, … <NA>   <NA>  NA                     <NA>                 <NA>        
## 6 China, … <NA>   <NA>  NA                     <NA>                 <NA>        
## # ℹ abbreviated name: ¹​`Calendar Date (95.4%) (BC)`
## # ℹ 10 more variables: `Calculated Age (1950 - # for BP) (BC)` <chr>,
## #   Epoch <chr>, `Other available age data from publications cited` <chr>,
## #   `What type of sample` <chr>, `Species Referal` <chr>,
## #   `Species Referal Revised` <chr>,
## #   `Specimen Number * check (institution)` <lgl>,
## #   `Internal Lab ID ** check` <chr>, LAT <chr>, LONG <chr>

Now that my data is condensed, we can start the visualization process!

Here I create a map of Europe:

world <- map_data("world")

europe <- subset(world, region %in% c("Albania", "Andorra", "Armenia", "Austria", "Azerbaijan","Belarus", "Belgium", "Bosnia and Herzegovina", "Bulgaria", "Croatia", "Cyprus", "Czechia","Denmark","Estonia","Finland","France","Georgia", "Germany", "Greece","Hungary","Iceland","Ireland", "Italy","Kazakhstan", "Kosovo", "Latvia","Liechtenstein",  "Lithuania", "Luxembourg","Malta","Moldova","Monaco","Montenegro", "Macedonia", "Netherlands","Norway","Poland","Portugal","Romania",   "Russia","San Marino","Serbia","Slovakia","Slovenia","Spain",    "Sweden","Switzerland","Turkey","Ukraine","UK","Vatican"))
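
One thing to keep in mind: `subset()` with `%in%` silently skips any name that does not appear in `world$region`, so a country whose name is spelled differently in the map data simply drops off the map. A quick way to check is `setdiff()`. In this sketch the `available` vector is made up; in the real script it would be `unique(world$region)`, and I am not claiming these particular names are the ones that differ:

```r
# Hypothetical region names, standing in for unique(world$region)
available <- c("France", "Czech Republic", "North Macedonia", "UK")

# Names we asked for
wanted <- c("France", "Czechia", "Macedonia", "UK")

# Names in our list that the map data does not know about
not_found <- setdiff(wanted, available)
not_found
```

An empty result means every requested country was matched.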

Here I plot the map of Europe I created:

ggplot(data = europe, aes(x = long, y = lat, group = group)) + 
  geom_polygon(fill = "white", color = "black") +
  theme_void()

Here I fix the coordinate ratios of my map of Europe:

ggplot(data = europe, aes(x = long, y = lat, group = group)) + 
  geom_polygon(fill = "white", color = "black") +
  theme_void() +
  coord_fixed(ratio=1.5, xlim = c(-15,180), ylim = c(35,80))
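
If you would rather compute a ratio than pick one by eye, a common rule of thumb is `1 / cos(latitude)`, because a degree of longitude gets narrower as you move away from the equator. The `ratio` argument of `coord_fixed()` is expressed as y / x, so this value stretches the latitude axis accordingly. A minimal sketch, assuming the midpoint of the plotted latitude window is a reasonable reference latitude:

```r
# Latitude window used in the plot above
lat_range <- c(35, 80)

# Reference latitude: the middle of the window (an assumption, not a rule)
mid_lat <- mean(lat_range)

# Degrees to radians, then apply the 1 / cos(latitude) rule of thumb
ratio <- 1 / cos(mid_lat * pi / 180)
ratio
```

The result can be passed to `coord_fixed(ratio = ratio, ...)` in place of a hand-picked number.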

Here I create a map of Asia:

asia <- subset(world, region %in% c("Afghanistan", "Armenia", "Azerbaijan", "Bahrain", "Bangladesh", "Bhutan", 
"Brunei", "Cambodia", "China", "Cyprus", "Georgia", "India", "Indonesia", "Iran", "Iraq", "Israel", "Japan", "Jordan", "Kazakhstan", "Kuwait", "Kyrgyzstan", "Laos", "Lebanon", "Malaysia", "Maldives", "Mongolia", "Myanmar","Nepal", "North Korea", "Oman", "Pakistan", "Palestine", "Philippines", "Qatar", "Russia", "Saudi Arabia", "Singapore", "South Korea", "Sri Lanka", "Syria", "Taiwan", "Tajikistan", "Thailand", "Timor-Leste", "Turkey", "Turkmenistan", "United Arab Emirates", "Uzbekistan", "Vietnam", "Yemen"))

Here I plot the map of Asia:

ggplot(data = asia, aes(x = long, y = lat, group = group)) + 
  geom_polygon(fill = "white", color = "black") +
  coord_fixed(1.2) 

Here I remove the grey background and the axes from the map of Asia with theme_void(), keeping the fixed coordinate ratio:

ggplot(data = asia, aes(x = long, y = lat, group = group)) + 
  geom_polygon(fill = "white", color = "black") +
  coord_fixed(1.2) +
  theme_void()

Here I combine the data for Europe and Asia and save it as “eurasia”:

eurasia <- rbind(europe, asia)
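
Note that seven countries (Armenia, Azerbaijan, Cyprus, Georgia, Kazakhstan, Russia, and Turkey) appear in both the Europe and Asia lists, so `rbind()` stacks their rows twice. That is harmless for the plots here, but if you want each row only once, `unique()` is one option. A sketch on invented toy data:

```r
# Toy stand-ins for the europe and asia data frames (values are invented)
europe_toy <- data.frame(long = c(2, 35),   lat = c(47, 39), region = c("France", "Turkey"))
asia_toy   <- data.frame(long = c(35, 104), lat = c(39, 35), region = c("Turkey", "China"))

# Plain stacking keeps the shared Turkey row twice
stacked <- rbind(europe_toy, asia_toy)

# unique() drops rows that are exact duplicates
deduped <- unique(stacked)

nrow(stacked)
nrow(deduped)
```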

Now that we’ve learned to make maps, let’s start adding my data points to the map!

Here I convert LAT and LONG to numeric:

CleanData$LONG <-as.numeric(as.character(CleanData$LONG))
CleanData$LAT  <-as.numeric(as.character(CleanData$LAT))
## Warning: NAs introduced by coercion
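
The warning means that at least one value could not be read as a number and was turned into NA. To find out which entries caused it, something like the following could help. This is a sketch with invented values (`raw_long` is hypothetical), not the real column:

```r
# Invented raw coordinate strings; one of them is not a clean number
raw_long <- c("116.4", "approx. 103", NA, "87.6")

# Convert quietly so we can inspect the failures ourselves
num_long <- suppressWarnings(as.numeric(raw_long))

# Entries that were present in the raw data but became NA after conversion
failed <- raw_long[is.na(num_long) & !is.na(raw_long)]
failed
```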

Here I remove rows that contain NAs in the LAT or LONG columns, then save the result as a new dataset, CData:

CData <- CleanData[!is.na(CleanData$LONG) & !is.na(CleanData$LAT), ]
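
Another way to keep only the rows that have both coordinates is `complete.cases()` restricted to the two coordinate columns. A minimal sketch on made-up points (`pts` is hypothetical):

```r
# Invented points: only site A has both coordinates
pts <- data.frame(
  site = c("A", "B", "C", "D"),
  LAT  = c(45, NA, 30, NA),
  LONG = c(100, 90, NA, NA)
)

# TRUE only for rows where LAT and LONG are both present
has_coords <- complete.cases(pts[, c("LAT", "LONG")])

kept <- pts[has_coords, ]
kept
```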

Here I visualize with the map of Asia and also add my data points in black:

# Light gray fill, thin black outlines, and the data points as black dots

ggplot() +
  geom_map(
    data = asia, map = world,
    aes(x= long, y= lat, map_id = region),
    color = "black", fill = "lightgray", size = 0.1) +
  geom_point(data = CData, aes(LONG, LAT))
## Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
## Warning in geom_map(data = asia, map = world, aes(x = long, y = lat, map_id =
## region), : Ignoring unknown aesthetics: x and y
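
Both warnings are harmless, but they can be avoided. The first one says that `size` for outlines was replaced by `linewidth` in ggplot2 3.4.0. The second appears because `geom_map()` does not use `x` and `y`; one option is to drop them from `aes()` and set the plot limits with `expand_limits()` instead. Here is a sketch on a hypothetical one-polygon map (`toy_map` and `toy_pts` are invented), assuming ggplot2 3.4.0 or later:

```r
library(ggplot2)

# Hypothetical one-polygon "map" in the column layout geom_map() expects
toy_map <- data.frame(
  long   = c(0, 10, 10, 0),
  lat    = c(0, 0, 10, 10),
  region = "Squareland"
)
toy_pts <- data.frame(LONG = c(2, 8), LAT = c(3, 7))

p <- ggplot() +
  geom_map(
    data = toy_map, map = toy_map,
    aes(map_id = region),                 # no x/y, so nothing is ignored
    color = "black", fill = "lightgray",
    linewidth = 0.1                       # outline width, instead of size
  ) +
  expand_limits(x = toy_map$long, y = toy_map$lat) +  # make the axes cover the map
  geom_point(data = toy_pts, aes(LONG, LAT))

p
```

Without `expand_limits()`, the axes would be set by the points alone and the map could be cropped.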

Here I plot the map of Eurasia, add the data points from CData in red, and remove the grey background with theme_void():

ggplot(data = eurasia, aes(x = long, y = lat)) + 
  geom_polygon(fill = "white", color = "black") +
  coord_fixed(1.2) +
  theme_void() +
  geom_point(data=CData, aes(LONG,LAT), color="red")
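
This plot shows stray lines running across the map. They appear because `group = group` is missing from `aes()`, so `geom_polygon()` joins the vertices of all countries into one long polygon. A minimal sketch of the difference, using two invented squares in place of countries:

```r
library(ggplot2)

# Two separate squares standing in for two countries (hypothetical data)
shapes <- data.frame(
  long  = c(0, 1, 1, 0,  3, 4, 4, 3),
  lat   = c(0, 0, 1, 1,  0, 0, 1, 1),
  group = rep(c(1, 2), each = 4)
)

# Without group: all eight vertices are joined into a single polygon,
# which draws connecting lines between the two squares
p_lines <- ggplot(shapes, aes(long, lat)) +
  geom_polygon(fill = "white", color = "black")

# With group: each square is drawn as its own polygon
p_clean <- ggplot(shapes, aes(long, lat, group = group)) +
  geom_polygon(fill = "white", color = "black")

p_clean
```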

Here we get rid of the stray lines drawn by geom_polygon by switching to geom_map. I also remove theme_void() and save the plot as “EurasiaMap”:

EurasiaMap <- ggplot() +
  geom_map(
    data = eurasia, map = world,
    aes(long, lat, map_id = region),
    color = "black", fill = "lightgray", size = 0.1) +
      geom_point(data=CData, aes(LONG,LAT), color="red", size = .5)
## Warning in geom_map(data = eurasia, map = world, aes(long, lat, map_id =
## region), : Ignoring unknown aesthetics: x and y
EurasiaMap

Here we add x and y labels and save as EurasiaMap:

EurasiaMap <- ggplot() +
  geom_map(
    data = eurasia, map = world,
    aes(long, lat, map_id = region),
    color = "black", fill = "lightgray", size = 0.1) +
 geom_point(data=CData, aes(x = LONG, y = LAT), color="red", size = .5) +
  labs ( x = "Longitude", y = "Latitude")
## Warning in geom_map(data = eurasia, map = world, aes(long, lat, map_id =
## region), : Ignoring unknown aesthetics: x and y
EurasiaMap

Most recently I’ve been focusing on Mongolia and China, so here I make a map of only these two countries and save it as “china” (just for fun!):

china <- subset(world, region %in% c("China", "Mongolia"))

 ggplot() +
  geom_map(data = china, map= world,
           aes(long, lat, map_id = region),
           color = "black", fill = "lightgray", size = 0.1) +
 geom_point(data=CData, aes(x = LONG, y = LAT), color="red", size = .5) +
  labs ( x = "Longitude", y = "Latitude") +
  coord_fixed(1.2)
## Warning in geom_map(data = china, map = world, aes(long, lat, map_id =
## region), : Ignoring unknown aesthetics: x and y

Here I add a title to the map:

 ggplot() +
  geom_map(data = china, map= world,
           aes(long, lat, map_id = region),
           color = "black", fill = "lightgray", size = 0.1) +
 geom_point(data=CData, aes(x = LONG, y = LAT), color="red", size = .5) +
  labs ( x = "Longitude", y = "Latitude") +
  coord_fixed(1.2) +
  ggtitle("MAP of ancient OES: focusing on China and Mongolia")
## Warning in geom_map(data = china, map = world, aes(long, lat, map_id =
## region), : Ignoring unknown aesthetics: x and y

Here I add the species referrals as labels at the locations where the eggshells were found:

 ggplot() +
  geom_map(data = china, map= world,
           aes(map_id = region),
           color = "black", fill = "lightgray", size = 0.1) +
 geom_point(data=CData, aes(x = LONG, y = LAT), color="red", size = .5) +
  labs ( x = "Longitude", y = "Latitude") +
  coord_fixed(1.2) +
  ggtitle("MAP of ancient OES in China and Mongolia") +
  geom_text(
    data= CData,
  aes( x = LONG, y = LAT, label= `Species Referal`))
## Warning: Removed 16 rows containing missing values or values outside the scale range
## (`geom_text()`).

Here we change the map from china to eurasia, making sure we also change the title of the map. We also keep the label text from overlapping by using geom_text_repel from the ggrepel package:

 ggplot() +
  geom_map(data = eurasia, map= world,
           aes(map_id = region),
           color = "black", fill = "lightgray", size = 0.1) +
 geom_point(data=CData, aes(x = LONG, y = LAT), color="red", size = .5) +
  labs ( x = "Longitude", y = "Latitude") +
  coord_fixed(1.2) +
  ggtitle("MAP of ancient OES: focusing on Eurasia") +
  geom_text_repel(
    data= CData,
  aes( x = LONG, y = LAT, label= `Species Referal`))
## Warning: Removed 16 rows containing missing values or values outside the scale range
## (`geom_text_repel()`).
## Warning: ggrepel: 36 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps
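
Two things are going on in these warnings. The removed rows are most likely points that have no species referral to use as a label, and the unlabeled points are ones ggrepel gave up on because too many labels overlap. If you want fewer warnings, one option is to label only the rows that have a label and to raise `max.overlaps`. A sketch on invented points (`pts` and its `species` column are hypothetical):

```r
library(ggplot2)
library(ggrepel)

# Invented points; the third one has no species label
pts <- data.frame(
  LONG    = c(100, 100.1, 100.2, 105),
  LAT     = c(40, 40.1, 40.05, 42),
  species = c("Species A", "Species B", NA, "Species A")
)

# Only rows with a label are passed to the label layer
labelled <- pts[!is.na(pts$species), ]

p <- ggplot(pts, aes(LONG, LAT)) +
  geom_point(color = "red") +
  geom_text_repel(
    data = labelled, aes(label = species),
    max.overlaps = Inf   # never drop a label because of overlaps
  )

p
```

With many points, `max.overlaps = Inf` can make the map crowded, so it is a trade-off rather than a fix.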

As we can see there are a bunch of floating red dots, let’s add Africa for them!

Here I make a map of Africa and save it as “africa”:

africa <- subset(world, region %in% c("Algeria", "Angola", "Benin", "Botswana", "Burkina Faso", "Burundi", "Cabo Verde", "Cameroon", "Central African Republic", "Chad", "Comoros", "Democratic Republic of the Congo", "Djibouti", "Egypt", "Equatorial Guinea", "Eritrea", "Eswatini", "Ethiopia", "Gabon", "Gambia", "Ghana", "Guinea", "Guinea-Bissau", "Ivory Coast", "Kenya", "Lesotho", "Liberia", "Libya", "Madagascar", "Malawi", "Mali", "Mauritania", "Mauritius", "Morocco", "Mozambique", "Namibia", "Niger", "Nigeria", "Republic of the Congo", "Rwanda", "Sao Tome and Principe", "Senegal", "Seychelles", "Sierra Leone", "Somalia", "South Africa", "South Sudan", "Sudan", "Tanzania", "Togo", "Tunisia", "Uganda", "Zambia", "Zimbabwe"))

Here I combine my europe, asia, and africa maps and save it as “fullmap”:

fullmap <- rbind(europe, asia, africa)

Here I plot my OES data on the full map, change the title, and zoom out:

 ggplot() +
  geom_map(data = fullmap, map= world,
           aes(map_id = region),
           color = "black", fill = "lightgray", size = 0.1) +
 geom_point(data=CData, aes(x = LONG, y = LAT), color="red", size = .5) +
  labs ( x = "Longitude", y = "Latitude") +
  coord_fixed(xlim = c(-90, 200), ylim = c(-30, 80), ratio = 3/2)+
  ggtitle("MAP of ancient OES") +
  geom_text_repel(
    data= CData,
  aes( x = LONG, y = LAT, label= `Species Referal`))
## Warning: Removed 16 rows containing missing values or values outside the scale range
## (`geom_text_repel()`).
## Warning: ggrepel: 50 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps

Now that we understand my basic data, the fun visualization begins!

Let’s go back and give our countries some funky colors by continent.

(I chose to use the color “palegreen4” for European countries, “deeppink4” for African countries, and “yellowgreen” for Asian countries.)

# create a colorful "world", reusing the regions already stored in europe, africa, and asia
worldcolor <- mutate(world, fill = case_when(
  region %in% europe$region ~ "palegreen4",
  region %in% africa$region ~ "deeppink4",
  region %in% asia$region ~ "yellowgreen",
  TRUE ~ "white"))

# create a colorful Africa named "africacolor"
africacolor <- subset(worldcolor, region %in% africa$region)

# create a colorful Asia named "asiacolor"
asiacolor <- subset(worldcolor, region %in% asia$region)

# create a colorful Europe named "europecolor"
europecolor <- subset(worldcolor, region %in% europe$region)

# Now we rbind the colorful maps into a "fullmapcolor"
fullmapcolor <- rbind(europecolor, asiacolor, africacolor)
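
A detail worth knowing: `case_when()` uses the first condition that matches. Countries that sit in both the Europe and Asia lists (Russia and Turkey, for example) therefore get the European color, because the Europe condition comes first. A small sketch with shortened, made-up lists (`europe_list`, `asia_list`, and `regions` are illustrative):

```r
library(dplyr)

# Shortened, illustrative country lists with two shared entries
europe_list <- c("France", "Russia", "Turkey")
asia_list   <- c("China", "Russia", "Turkey")

regions <- data.frame(region = c("France", "Russia", "China", "Brazil"))

# Conditions are checked top to bottom; the first TRUE wins
colored <- mutate(regions, fill = case_when(
  region %in% europe_list ~ "palegreen4",
  region %in% asia_list   ~ "yellowgreen",
  TRUE                    ~ "white"
))

colored
```

Swapping the order of the two conditions would color the shared countries as Asian instead.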

Now I plot the colorful map:

ggplot() +
  geom_map(data = fullmapcolor, map= worldcolor,
           aes(map_id = region, fill = fill),
           color = "black", size = 0.1) +
 geom_point(data=CData, aes(x = LONG, y = LAT), color="red", size = .5) +
  labs ( x = "Longitude", y = "Latitude") +
  coord_fixed(xlim = c(-90, 200), ylim = c(-30, 80), ratio = 3/2)+
  ggtitle("MAP of ancient OES") +
  scale_fill_identity()

It works!
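
The piece that makes this work is `scale_fill_identity()`: it tells ggplot2 that the `fill` column already holds color names and should be used as-is, rather than being treated as categories that get default colors. A tiny sketch on invented tiles:

```r
library(ggplot2)

# Three tiles whose fill column already contains color names
tiles <- data.frame(
  x    = 1:3,
  y    = 1,
  fill = c("palegreen4", "deeppink4", "yellowgreen")
)

p <- ggplot(tiles, aes(x, y, fill = fill)) +
  geom_tile() +
  scale_fill_identity()   # use the values in fill directly as colors

# The colors ggplot2 will actually draw
layer_data(p)$fill
```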

Let’s make this more interesting and convert the dots into triangles:

ggplot() +
  geom_map(data = fullmapcolor, map= worldcolor,
           aes(map_id = region, fill = fill),
           color = "black", size = 0.1) +
 geom_point(data=CData, aes(x = LONG, y = LAT), color="red", size = .5, shape = 24) +
  labs ( x = "Longitude", y = "Latitude") +
  coord_fixed(xlim = c(-90, 200), ylim = c(-30, 80), ratio = 3/2)+
  ggtitle("MAP of ancient OES") +
  scale_fill_identity()

Now I apply theme_void() and add species referral labels:

ggplot() +
  geom_map(data = fullmapcolor, map= worldcolor,
           aes(map_id = region, fill = fill),
           color = "black", size = 0.1) +
 geom_point(data=CData, aes(x = LONG, y = LAT), color="red", size = .5, shape = 24) +
  labs ( x = "Longitude", y = "Latitude") + 
  coord_fixed(xlim = c(-90, 200), ylim = c(-30, 80), ratio = 3/2)+
  ggtitle("MAP of ancient OES") +
  scale_fill_identity()+
  geom_text_repel(
    data= CData,
  aes( x = LONG, y = LAT, label= `Species Referal`)) +
  theme_void()
## Warning: Removed 16 rows containing missing values or values outside the scale range
## (`geom_text_repel()`).
## Warning: ggrepel: 48 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps

Here I center the map title:

ggplot() +
  geom_map(data = fullmapcolor, map= worldcolor,
           aes(map_id = region, fill = fill),
           color = "black", size = 0.1) +
 geom_point(data=CData, aes(x = LONG, y = LAT), color="red", size = .5, shape = 24) +
  labs ( x = "Longitude", y = "Latitude") +
  coord_fixed(xlim = c(-90, 200), ylim = c(-30, 80), ratio = 3/2)+
  ggtitle("MAP of ancient OES") +
  scale_fill_identity()+
  geom_text_repel(
    data= CData,
  aes( x = LONG, y = LAT, label= `Species Referal`)) +
  theme_void() +
    theme(plot.title = element_text(hjust = 0.5))
## Warning: Removed 16 rows containing missing values or values outside the scale range
## (`geom_text_repel()`).
## Warning: ggrepel: 48 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps

The map is pretty but we need MORE!

Here I remove question marks from my Location column:

CData$Location <- gsub("\\?", "", CData$Location)
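
The double backslash is needed because `?` is a special character in regular expressions, so it has to be escaped to match a literal question mark. Setting `fixed = TRUE` is another way to do the same thing. A sketch on made-up location strings:

```r
# Invented location strings, some with uncertain entries marked by "?"
locations <- c("China?", "Mongolia", "Inner Mongolia?, China")

# Option 1: escape the question mark in the regular expression
escaped <- gsub("\\?", "", locations)

# Option 2: treat the pattern as plain text instead of a regular expression
literal <- gsub("?", "", locations, fixed = TRUE)

escaped
```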

Here I plot my map with the OES data points colored by Location and drawn as “x” markers:

ggplot() +
  geom_map(data = fullmapcolor, map= worldcolor,
           aes(map_id = region, fill = fill),
           color = "black", size = 0.1) +
 geom_point(data=CData, aes(x = LONG, y = LAT, color = Location), size = .5, shape = 4) +
  labs ( x = "Longitude", y = "Latitude") +
  coord_fixed(xlim = c(-90, 200), ylim = c(-30, 80), ratio = 3/2)+
  ggtitle("MAP of ancient OES") +
  scale_fill_identity()+
  geom_text_repel(
    data= CData,
  aes( x = LONG, y = LAT, label= `Species Referal`)) +
  theme_void() +
    theme(plot.title = element_text(hjust = 0.5))
## Warning: Removed 16 rows containing missing values or values outside the scale range
## (`geom_text_repel()`).
## Warning: ggrepel: 67 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps

Here I make the legend smaller:

ggplot() +
  geom_map(data = fullmapcolor, map= worldcolor,
           aes(map_id = region, fill = fill),
           color = "black", size = 0.1) +
 geom_point(data=CData, aes(x = LONG, y = LAT, color = Location), size = .5, shape = 4) +
  labs ( x = "Longitude", y = "Latitude") +
  coord_fixed(xlim = c(-90, 200), ylim = c(-30, 80), ratio = 3/2)+
  ggtitle("MAP of ancient OES") +
  scale_fill_identity()+
  geom_text_repel(
    data= CData,
  aes( x = LONG, y = LAT, label= `Species Referal`)) +
  theme_void()   +
    theme(plot.title = element_text(hjust = 0.5))+
  theme(legend.key.size = unit(0.4, "lines"))
## Warning: Removed 16 rows containing missing values or values outside the scale range
## (`geom_text_repel()`).
## Warning: ggrepel: 67 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps

Here I move the legend to the bottom of the map and make its keys even smaller:

ggplot() +
  geom_map(data = fullmapcolor, map= worldcolor,
           aes(map_id = region, fill = fill),
           color = "black", size = 0.1) +
 geom_point(data=CData, aes(x = LONG, y = LAT, color = Location), size = .5) +
  labs ( x = "Longitude", y = "Latitude") +
  coord_fixed(xlim = c(-90, 200), ylim = c(-30, 80), ratio = 3/2)+
  ggtitle("MAP of ancient OES") +
  scale_fill_identity()+
  geom_text_repel(
    data= CData,
  aes( x = LONG, y = LAT, label= `Species Referal`)) +
  theme_void()  +
  theme(legend.position = "bottom", legend.key.size = unit(0.1, "lines")) +
    theme(plot.title = element_text(hjust = 0.5))
## Warning: Removed 16 rows containing missing values or values outside the scale range
## (`geom_text_repel()`).
## Warning: ggrepel: 50 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps

What a beautiful and colorful map,

but what if we want to color the points by Epoch? AND also change the shape!

ggplot() +
  geom_map(data = fullmapcolor, map= worldcolor,
           aes(map_id = region, fill = fill),
           color = "black", size = 0.1) +
 geom_point(data=CData, aes(x = LONG, y = LAT, color = Epoch), size = .5, shape = 13) +
  labs ( x = "Longitude", y = "Latitude") +
  coord_fixed(xlim = c(-90, 200), ylim = c(-30, 80), ratio = 3/2)+
  ggtitle("MAP of ancient OES (by Epoch)") +
  scale_fill_identity()+
  geom_text_repel(
    data= CData,
  aes( x = LONG, y = LAT, label= `Species Referal`)) +
  theme_void() +
  theme(legend.position = "bottom", legend.key.size = unit(0.1, "lines")) +
    theme(plot.title = element_text(hjust = 0.5))
## Warning: Removed 16 rows containing missing values or values outside the scale range
## (`geom_text_repel()`).
## Warning: ggrepel: 48 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps

Here I remove the species referral labels:

ggplot() +
  geom_map(data = fullmapcolor, map= worldcolor,
           aes(map_id = region, fill = fill),
           color = "black", size = 0.1) +
 geom_point(data=CData, aes(x = LONG, y = LAT, color = Epoch), size = .5, shape = 13) +
  labs ( x = "Longitude", y = "Latitude") +
  coord_fixed(xlim = c(-90, 200), ylim = c(-30, 80), ratio = 3/2)+
  ggtitle("MAP of ancient OES (by Epoch)") +
  scale_fill_identity()+
  theme_void() +
  theme(legend.position = "bottom", legend.key.size = unit(0.1, "lines")) +
    theme(plot.title = element_text(hjust = 0.5))

Here I zoom into the map to focus more on where OES are found:

ggplot() +
  geom_map(data = fullmapcolor, map= worldcolor,
           aes(map_id = region, fill = fill),
           color = "black", size = 0.1) +
 geom_point(data=CData, aes(x = LONG, y = LAT, color = Epoch), size = .5, shape = 13) +
  labs ( x = "Longitude", y = "Latitude") +
  coord_fixed(xlim = c(-90, 200), ylim = c(-30, 50), ratio = 3/2)+
  ggtitle("MAP of ancient OES") +
  scale_fill_identity()+
  theme_void() +
  theme(legend.position = "bottom", legend.key.size = unit(0.1, "lines")) +
    theme(plot.title = element_text(hjust = 0.5))

Lastly, I would like to visualize the datapoints as bigger stars with colors representing the species referrals.

ggplot() +
  geom_map(data = fullmapcolor, map= worldcolor,
           aes(map_id = region, fill = fill),
           color = "black", size = 0.1) +
 geom_point(data=CData, aes(x = LONG, y = LAT, color = `Species Referal`), size = 1, shape = 8) +
  labs ( x = "Longitude", y = "Latitude") +
  coord_fixed(xlim = c(-90, 200), ylim = c(-30, 50), ratio = 3/2)+
  ggtitle("MAP of ancient OES") +
  scale_fill_identity() +
  theme_void() +
  theme(plot.title = element_text(hjust = 0.5),
        legend.position = "bottom", legend.key.size = unit(0.1, "lines"))

And if we want to get a sense of where we are in terms of latitude and longitude, let’s bring the axes and grid background back by dropping theme_void():

ggplot() +
  geom_map(data = fullmapcolor, map= worldcolor,
           aes(map_id = region, fill = fill),
           color = "black", size = 0.1) +
 geom_point(data=CData, aes(x = LONG, y = LAT, color = `Species Referal`), size = 1, shape = 8) +
  labs ( x = "Longitude", y = "Latitude") +
  coord_fixed(xlim = c(-80, 180), ylim = c(-30, 50), ratio = 3/2)+
  ggtitle("MAP of ancient OES (by species)") +
  scale_fill_identity() +
  theme(plot.title = element_text(hjust = 0.5),
        legend.position = "bottom", legend.key.size = unit(0.1, "lines"))

Hope you enjoyed!